On the Similarity between the Laplace and Neural Tangent Kernels

Neural Information Processing Systems

Finally, we provide experiments on real data comparing NTK and the Laplace kernel, along with a larger class of γ-exponential kernels. We show that these perform almost identically.
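For readers unfamiliar with the kernel family named above, here is a minimal sketch of the standard γ-exponential kernel, k(x, y) = exp(-c ||x - y||^γ), which reduces to the Laplace kernel at γ = 1 and to the Gaussian kernel at γ = 2. The function names and the scale parameter `c` are illustrative assumptions, not taken from the paper, and the paper's exact normalization may differ.

```python
import math


def gamma_exponential_kernel(x, y, gamma=1.0, c=1.0):
    """Return exp(-c * ||x - y||**gamma) for two equal-length vectors.

    gamma=1 gives the Laplace kernel, gamma=2 the Gaussian kernel.
    `c` is a hypothetical scale parameter chosen for this sketch.
    """
    if not 0.0 < gamma <= 2.0:
        raise ValueError("gamma must lie in (0, 2]")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Euclidean distance between the two points.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-c * dist ** gamma)


def laplace_kernel(x, y, c=1.0):
    """Laplace kernel as the gamma=1 member of the family."""
    return gamma_exponential_kernel(x, y, gamma=1.0, c=c)
```

Calling `laplace_kernel([0.0, 0.0], [3.0, 4.0])` evaluates exp(-5), since the two points are at Euclidean distance 5.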


Supplementary Materials

Neural Information Processing Systems

We first prove the direction $Z \perp T \Rightarrow SI(Z;T) = 0$, which is equivalent to proving $I(Z;T) = 0 \Rightarrow SI(Z;T) = 0$. We prove the contrapositive, i.e. rather than showing LHS $\Rightarrow$ RHS, we show that $\neg$RHS $\Rightarrow$ $\neg$LHS. Now assume that $\sup_{w_i, v_j} \rho(w_i^\top Z_i, v_j^\top T_j) > \epsilon$ for some $i, j$. Then by setting those elements in $w, v$ unrelated to $Z_i, T_j$ to zero, and those related to $Z_i, T_j$ exactly the same as $w_i, v_j$, we know that $\sup_{w, v} \rho(w^\top Z, v^\top T) > \epsilon$. All neural networks are trained by Adam with its default settings and a learning rate $\eta = 0.001$. Early stopping is a useful technique for avoiding overfitting; however, it needs to be carefully considered when applied to adversarial methods.


Generalized Sliced Wasserstein Distances

Soheil Kolouri, Kimia Nadjahi, Umut Simsekli, Roland Badeau, Gustavo Rohde

Neural Information Processing Systems

In this paper, we first clarify the mathematical connection between the SW distance and the Radon transform. We then utilize the generalized Radon transform to define a new family of distances for probability measures, which we call generalized sliced-Wasserstein (GSW) distances.
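As background for the abstract above, the ordinary (linear) sliced-Wasserstein distance between two equal-size samples can be estimated by projecting both samples onto random unit directions, each projection being one slice of the Radon transform of the empirical measure, and averaging the resulting one-dimensional Wasserstein costs. The sketch below is a plain Monte Carlo estimator under those assumptions; the function names, the number of projections, and the fixed seed are illustrative choices, and it does not implement the generalized (GSW) variant the paper proposes.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def sliced_wasserstein(xs, ys, n_projections=200, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    xs, ys: lists of points (lists of floats) with the same number of
    points and the same dimension. All parameter names are hypothetical.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_unit_vector(dim, rng)
        # Project each sample onto theta; in 1-D the optimal transport
        # plan between equal-size samples matches sorted values in order.
        px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
        py = sorted(sum(a * t for a, t in zip(y, theta)) for y in ys)
        total += sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)
    return (total / n_projections) ** (1.0 / p)
```

In one dimension every projection direction is ±1, so two samples that differ by a constant shift are at sliced distance equal to the absolute value of that shift, which gives a quick sanity check for the estimator.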